"Avatar work," a form of telework in which people with disabilities can engage in physical work such as customer service, is being implemented in society. To enable avatar work in a wider variety of occupations, we propose a mobile sales system that combines a mobile frozen drink machine with the avatar robot "OriHime", focusing on mobile customer service such as peddling. The effect of peddling with the system on customers is examined based on the results of video annotation.
Recently, photonic reinforcement learning, which accelerates computation by exploiting the physical nature of light, has been studied extensively. Previous studies used quantum interference of photons to achieve collective decision-making without choice conflicts when solving the competitive multi-armed bandit problem, a fundamental example of reinforcement learning. However, the bandit problem deals with a static environment where the agent's action does not influence the reward probabilities. This study aims to extend the conventional approach to a more general form of multi-agent reinforcement learning targeting the grid world problem. Unlike the conventional approach, the proposed scheme deals with a dynamic environment where the reward changes because of agents' actions. A successful photonic reinforcement learning scheme requires both a photonic system that contributes to the quality of learning and a suitable algorithm. This study proposes a novel learning algorithm, discontinuous bandit Q-learning, with a potential photonic implementation in view. Here, state-action pairs in the environment are regarded as slot machines in the context of the bandit problem, and the amount by which a Q-value is updated is regarded as the reward of the bandit problem. We perform numerical simulations to validate the effectiveness of the bandit algorithm. In addition, we propose a multi-agent architecture in which agents are indirectly connected through quantum interference of light, and quantum principles ensure that the agents' state-action pair selections are conflict-free. We demonstrate that multi-agent reinforcement learning can be accelerated owing to conflict avoidance among multiple agents.
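The mapping described in this abstract, where each state-action pair is treated as a slot machine and the size of the Q-value update serves as its reward, might be sketched as follows. This is a minimal illustration assuming the standard tabular Q-learning update; the function name `q_update`, the grid coordinates, and the parameter values are hypothetical, and the paper's exact definition of the update amount may differ.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the size of the update.

    The returned value is what a bandit layer could treat as the "reward"
    of the arm (state, action). This mapping is an assumption for illustration.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    delta = alpha * (reward + gamma * best_next - q[(state, action)])
    q[(state, action)] += delta
    return delta


# Hypothetical grid-world step: move right from cell (0, 0) to cell (0, 1).
q = defaultdict(float)  # all Q-values start at zero
actions = ["up", "down", "left", "right"]
bandit_reward = q_update(q, (0, 0), "right", 1.0, (0, 1), actions, alpha=0.5)
```

With all Q-values initialised to zero, the first update for an arm depends only on the immediate environment reward and the learning rate.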
The long-standing theory that a colour-naming system evolves under the dual pressure of efficient communication and perceptual mechanism is supported by a growing number of linguistic studies, including an analysis of four decades of diachronic data from the Nafaanra language. This inspires us to explore whether artificial intelligence could evolve and discover a similar colour-naming system by optimising communication efficiency, represented here by high-level recognition performance. We propose a novel colour quantisation transformer, CQFormer, that quantises colour space while maintaining the accuracy of machine recognition on the quantised images. Given an RGB image, the Annotation Branch maps it into an index map before generating the quantised image with a colour palette, while the Palette Branch uses a key-point detection approach to find suitable palette colours within the whole colour space. Through interaction with the colour annotation, CQFormer balances machine vision accuracy against colour perceptual structure, such as a distinct and stable colour distribution for the discovered colour system. Notably, we observe a consistent evolution pattern between our artificial colour system and the basic colour terms across human languages. Our method also serves as an efficient image compression technique that reduces image storage while maintaining high performance in high-level recognition tasks such as classification and detection. Extensive experiments demonstrate the superior performance of our method with extremely low bit-rate colours. We will release the source code soon.
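The relationship between an index map, a palette, and the quantised image can be illustrated with a small sketch. Note that this uses plain nearest-colour assignment with a fixed, hypothetical palette, not the learned Annotation and Palette Branches of CQFormer; it only shows the data structures involved.

```python
def quantise(pixels, palette):
    """Return (index_map, quantised_pixels) via nearest-palette-colour lookup.

    This is a stand-in for a learned quantiser: each pixel is assigned the
    index of the closest palette colour in squared RGB distance.
    """
    def sq_dist(c1, c2):
        return sum((a - b) ** 2 for a, b in zip(c1, c2))

    index_map = [
        min(range(len(palette)), key=lambda k: sq_dist(p, palette[k]))
        for p in pixels
    ]
    quantised = [palette[k] for k in index_map]
    return index_map, quantised


# Hypothetical two-colour palette: each pixel needs only one bit of storage.
palette = [(0, 0, 0), (255, 255, 255)]
pixels = [(10, 20, 30), (250, 240, 245), (128, 128, 128)]
index_map, quantised = quantise(pixels, palette)
```

Storing the index map plus the palette, rather than full RGB values, is what makes very low bit-rate representations possible.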
Modern dynamic, heterogeneous networks expose agents to different environments, each with its own state transition probabilities, which leads to the local strategy trap problem in traditional federated reinforcement learning (FRL) based network optimization algorithms. To solve this problem, we propose Differentiated Federated Reinforcement Learning (DFRL), which replaces the global policy model integration and the local inference with the global policy model used in traditional FRL with a collaborative learning process that learns global trends and differentiated local policy models in parallel. In DFRL, the local policy model is adaptively updated using the global trends model and the local environment, and thus achieves better differentiated adaptation. We evaluate the proposal's advantage over state-of-the-art FRL in the classical CartPole game with heterogeneous environments. Furthermore, we implement the proposal in a heterogeneous Space-air-ground Integrated Network (SAGIN) for the classical traffic offloading problem. The simulation results show that the proposal achieves better global performance and fairness than the baselines in terms of throughput, delay, and packet drop rate.
Measuring the semantic similarity between two sentences is still an important task. The word mover's distance (WMD) computes the similarity via the optimal alignment between the sets of word embeddings. However, WMD does not utilize word order, making it difficult to distinguish sentences with large overlaps of similar words, even if they are semantically very different. Here, we attempt to improve WMD by incorporating the sentence structure represented by BERT's self-attention matrix (SAM). The proposed method is based on the Fused Gromov-Wasserstein distance, which simultaneously considers the similarity of the word embedding and the SAM for calculating the optimal transport between two sentences. Experiments on paraphrase identification and semantic textual similarity show that the proposed method improves WMD and its variants. Our code is available at https://github.com/ymgw55/WSMD.
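The word-order limitation of WMD noted in this abstract can be reproduced with a toy example. The sketch below assumes equal-length sentences with uniform word weights, in which case the optimal transport plan is a one-to-one matching and brute force over permutations gives the exact distance for short sentences. The 2-D embeddings in `EMB` and the function name `wmd_equal_length` are hypothetical; this is not the proposed Fused Gromov-Wasserstein method.

```python
import math
from itertools import permutations

# Hypothetical 2-D toy embeddings, not taken from any trained model.
EMB = {
    "dog": (1.0, 0.0),
    "man": (0.0, 1.0),
    "bites": (1.0, 1.0),
    "cat": (0.9, 0.1),
}


def wmd_equal_length(sent_a, sent_b):
    """Exact WMD for equal-length sentences with uniform word weights.

    With uniform weights and equal lengths, an optimal transport plan can be
    chosen as a one-to-one matching, so we minimise over all permutations.
    Only practical for very short sentences.
    """
    if len(sent_a) != len(sent_b):
        raise ValueError("this sketch only handles equal-length sentences")
    n = len(sent_a)
    return min(
        sum(math.dist(EMB[a], EMB[b]) for a, b in zip(sent_a, perm)) / n
        for perm in permutations(sent_b)
    )


# Same words, opposite meaning: the distance ignores who bites whom.
d_swapped = wmd_equal_length(["dog", "bites", "man"], ["man", "bites", "dog"])
# One word replaced by a nearby one: the distance is small but non-zero.
d_replaced = wmd_equal_length(["dog", "bites", "man"], ["cat", "bites", "man"])
```

Here the role-swapped sentence is judged identical to the original while the near-paraphrase is not, which is the failure mode that structural information such as a self-attention matrix is meant to address.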
Our team, Hibikino-Musashi@Home (HMA for short), was founded in 2010. It is based in the Kitakyushu Science and Research Park, Japan. We have participated in the open platform league of the RoboCup@Home Japan Open competition every year since 2010. Moreover, we participated in RoboCup 2017 Nagoya as open platform league and domestic standard platform league teams. Currently, the Hibikino-Musashi@Home team has 20 members from seven different laboratories based in the Kyushu Institute of Technology. In this paper, we introduce our team's activities and technologies.
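Super-resolving the coarse outputs of global climate simulations, termed downscaling, is crucial for making political and social decisions about systems that require long-term climate change projections. However, existing fast super-resolution techniques have yet to preserve the spatially correlated nature of climatological data, which is particularly important when we address systems with spatial extent, such as the development of transportation infrastructure. Here, we show that adversarial network-based machine learning enables us to correctly reconstruct inter-regional spatial correlations in downscaling at magnification factors of up to fifty, while maintaining pixel-wise statistical consistency. Direct comparison with measured meteorological data on temperature and precipitation distributions reveals that integrating climatologically important physical information is essential for accurate downscaling, which prompts us to call our approach $\pi$SRGAN (Physics Informed Super-Resolution Generative Adversarial Network). The present method has potential applications to the inter-regionally consistent assessment of climate change impacts.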
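We study best-arm identification with a fixed budget and contextual (covariate) information in stochastic multi-armed bandit problems. In each round, after observing contextual information, we choose a treatment arm using past observations and the current context. Our goal is to identify the best treatment arm, that is, the treatment arm with the maximal expected reward marginalized over the contextual distribution, with a minimal probability of misidentification. First, we derive semiparametric lower bounds for this problem, where we regard the gaps between the expected rewards of the best and suboptimal treatment arms as the parameters of interest, and all other parameters, such as the expected rewards conditioned on contexts, as nuisance parameters. We then develop the "Contextual RS-AIPW strategy," which consists of a random sampling (RS) rule that tracks a target allocation ratio and a recommendation rule that uses the augmented inverse probability weighting (AIPW) estimator. Our proposed Contextual RS-AIPW strategy is optimal because the upper bound for the probability of misidentification matches the semiparametric lower bound as the budget goes to infinity and the gaps converge to zero.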
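A recommendation rule built on an AIPW estimator might look like the following minimal sketch. It uses the generic textbook form of AIPW (a model prediction plus an inverse-probability-weighted residual for rounds in which the arm was actually chosen); the record layout, the function names `aipw_mean` and `recommend`, and the zero outcome model are assumptions for illustration, and the paper's estimator may differ in how the nuisance estimates are constructed.

```python
def aipw_mean(records, arm, outcome_model):
    """AIPW estimate of the expected reward of `arm`, averaged over contexts.

    records: list of (context, chosen_arm, reward, propensities), where
    propensities maps each arm to the probability with which it was chosen
    given the context (assumed known, since the sampling rule sets it).
    outcome_model(arm, context): predicted reward, a nuisance estimate.
    """
    total = 0.0
    for context, chosen, reward, propensities in records:
        prediction = outcome_model(arm, context)
        # The residual correction applies only when this arm was played.
        correction = (reward - prediction) / propensities[arm] if chosen == arm else 0.0
        total += prediction + correction
    return total / len(records)


def recommend(records, arms, outcome_model):
    """Recommend the arm with the largest AIPW-estimated mean reward."""
    return max(arms, key=lambda a: aipw_mean(records, a, outcome_model))


# Hypothetical data: two arms sampled with equal probability in every round.
uniform = {"a": 0.5, "b": 0.5}
records = [
    ("x1", "a", 1.0, uniform),
    ("x2", "b", 0.0, uniform),
    ("x1", "a", 1.0, uniform),
    ("x2", "b", 1.0, uniform),
]
zero_model = lambda arm, context: 0.0  # with a zero model, AIPW reduces to IPW
best = recommend(records, ["a", "b"], zero_model)
```

A better outcome model leaves the estimate unbiased (given correct propensities) but reduces its variance, which is the usual motivation for the augmentation term.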
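One-shot Network Pruning at Initialization (OPaI) is an effective method for reducing the cost of network pruning. Recently, there has been a growing belief that data is unnecessary in OPaI. However, we reach the opposite conclusion through ablation experiments on two representative OPaI methods, SNIP and GraSP. Specifically, we find that informative data is crucial to enhancing pruning performance. In this paper, we propose two novel methods, Discriminative One-shot Network Pruning (DOP) and Super Stitching, which prune the network using high-level visually discriminative image patches. Our contributions are as follows. (1) Extensive experiments show that OPaI is data-dependent. (2) Super Stitching performs significantly better than the original OPaI methods on the ImageNet benchmark, especially for highly compressed models.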
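To see why a pruning-at-initialization criterion can depend on data, consider a sketch of the SNIP saliency score, which ranks each weight by the magnitude of weight times gradient on a mini-batch. The flat weight list, the gradient values, and the function name `snip_mask` below are hypothetical; SNIP additionally normalises the scores by their sum, which does not change the ranking and is omitted here.

```python
def snip_mask(weights, grads, keep_ratio):
    """Return a 0/1 mask keeping the weights with the largest |w * g|.

    grads are loss gradients computed on one mini-batch at initialization,
    so the resulting mask depends on which data that batch contains.
    """
    scores = [abs(w * g) for w, g in zip(weights, grads)]
    n_keep = max(1, round(keep_ratio * len(weights)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]


# Hypothetical weights with gradients from two different mini-batches.
weights = [1.0, -2.0, 0.5, 3.0]
grads_batch_a = [0.1, 0.5, -4.0, 0.0]
grads_batch_b = [0.1, 0.0, 0.0, 1.0]
mask_a = snip_mask(weights, grads_batch_a, keep_ratio=0.5)
mask_b = snip_mask(weights, grads_batch_b, keep_ratio=0.5)
```

The two batches produce different masks for the same weights, and the largest weight is pruned under the first batch because its gradient there is zero.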
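Recent works in self-supervised learning have demonstrated strong performance on scene-level dense prediction tasks by pretraining with object-centric or region-based correspondence objectives. In this paper, we present Region-to-Object Representation Learning (R2O), which unifies region-based and object-centric pretraining. R2O operates by training an encoder to dynamically refine region-based segments into object-centric masks and then jointly learning representations of the contents within the masks. R2O uses a "region refinement module" to group small image regions, generated using a region-level prior, into larger regions that tend to correspond to objects by clustering region-level features. As pretraining progresses, R2O follows a region-to-object curriculum, which encourages learning region-level features early on and gradually progresses to training object-centric representations. Representations learned using R2O lead to state-of-the-art performance in semantic segmentation on PASCAL VOC (+0.7 mIoU) and Cityscapes (+0.4 mIoU), and in instance segmentation on MS COCO (+0.3 mask AP). Furthermore, after pretraining on ImageNet, R2O-pretrained models surpass the existing state of the art in unsupervised object segmentation on the Caltech-UCSD Birds 200-2011 dataset (+2.9 mIoU). We provide the code and models from this work at https://github.com/kkallidromitis/r2o.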